In this article we present a survey ...
5 The use of web structure and content to identify subjectively interesting wel
Robert Cooley 

May 2003 ACM Transactions on Internet Technology (TOIT), Volume 3 Issue 

Full text available: pdf(540.06 KB) Additional Information: full citation, abstract, refere

The discipline of Web Usage Mining has grown rapidly in the past few years, d 
of the late 1990s. Web Usage Mining is the application of data mining techniq 
extract usage patterns. Yet, with all of the resources put into the problem, cla 
often tied to specific Web site properties that are not found in general. One re 
component of Web Usage Mi ... 

Keywords: Data mining, Web usage mining, World Wide Web 
Weiyi Meng, Clement Yu, King-Lup Liu

March 2002 ACM Computing Surveys (CSUR), Volume 34 Issue 1 

Full text available: pdf(416.07 KB) Additional Information: full citation, abstract, reference

Frequently a user's information needs are stored in the databases of multiple 
inefficient for an ordinary user to invoke multiple search engines and identify 
results. To support unified access to multiple search engines, a metasearch en 
metasearch engine receives a query from a user, it invokes the underlying se 
for the user. Metasearch engines have ... 

Keywords: Collection fusion, distributed collection, distributed information ret 
7 An investigation of geographic mapping techniques for internet hosts
Venkata N. Padmanabhan, Lakshminarayanan Subramanian
August 2001 Proceedings of the 2001 conference on Applications, technologies, ar
Full text available: pdf(319.78 KB) Additional Information: full citation, referen



8 Emergent web patterns: The connectivity sonar: detecting site functionality 
Einat Amitay, David Carmel, Adam Darlow, Ronny Lempel, Aya Soffer
August 2003 Proceedings of the fourteenth ACM conference on Hypertext and 

Full text available: pdf(153.40 KB) Additional Information: full citation, abstract, refere

Web sites today serve many different functions, such as corporate sites, searc 
are created for different purposes, their structure and connectivity characteris 
that sites of similar role exhibit similar structural patterns, as the functionalit 
hyperlinked structure and typical connectivity patterns to and from the rest o 
sites is refle ... 
Eric J. Glover, Kostas Tsioutsiouliklis, Steve Lawrence, David M. Pennock, Gary W
May 2002 Proceedings of the eleventh international conference on World Wid 

Full text available: pdf(136.12 KB) Additional Information: full citation, abstract, reference

The structure of the web is increasingly being used to improve organization, s 
web. For example, Google uses the text in citing documents (documents that 
analyze the relative utility of document text, and the text in citing documents 
description. Results show that the text in citing documents, when available, o 
descriptive power than th ... 
web structure
Luis Gravano, Panagiotis G. Ipeirotis, Mehran Sahami

January 2003 ACM Transactions on Information Systems (TOIS), Volume 21 

Full text available: pdf(3.62 MB) Additional Information: full citation, abstract, reference

The contents of many valuable Web-accessible databases are only available t 
invisible to traditional Web "crawlers." Recently, commercial Web sites have s 
Web-accessible databases into Yahoo!-like hierarchical classification schemes 
system that automates this classification process by using a small number of 
classifiers. QProber can use a variety of types of ... 

Keywords: Database classification, Web databases, hidden Web 
12 The state of the art in automating usability evaluation of user interfaces
December 2001 ACM Computing Surveys (CSUR), Volume 33 Issue 4 

Full text available: pdf(2.31 MB) Additional Information: full citation, abstract, references, citin

Usability evaluation is an increasingly important part of the user interface des 
can be expensive in terms of time and human resources, and automation is th 
existing approaches. This article presents an extensive survey of usability eva 
new taxonomy that emphasizes the role of automation. The survey analyzes e 
aspects of usability evaluation aut ... 

Keywords: Graphical user interfaces, taxonomy, usability evaluation automati 



13 Personalized spiders for web search and analysis 

Michael Chau, Daniel Zeng, Hsinchun Chen

January 2001 Proceedings of the first ACM/IEEE-CS joint conference on Digita 

Full text available: pdf(672.04 KB) Additional Information: full citation, abstract, reference

Searching for useful information on the World Wide Web has become increas
engines have been helping people to search on the web, low recall rate and o 
more problematic as the web grows. In addition, search tools usually present 
failing to provide further personalized analysis which could help users identify 
results. To alleviate these ... 

Keywords: information retrieval, internet searching and browsing, internet sp 
self-organizing map 
Hwanjo Yu, Jiawei Han, Kevin Chen-Chuan Chang

July 2002 Proceedings of the eighth ACM SIGKDD international conference on Kn 

Full text available: pdf(1.01 MB) Additional Information: full citation, abstract, referenc

Web page classification is one of the essential techniques for Web mining. Sp 
user-interesting class is the first step of mining interesting information from t 
for an interesting class requires laborious pre-processing such as collecting po 
instance, in order to construct a "homepage" classifier, one needs to collect a 
and a sample of n ... 
Full text available: pdf(1.44 MB) Additional Information: full citation, abstract,

Web usage mining is the application of data mining techniques to discover us 
understand and better serve the needs of Web-based applications. Web usage 
preprocessing, pattern discovery, and pattern analysis. This paper describes e 
application potential, Web usage mining has seen a rapid increase in interest, 
communities. This pap ... 

Keywords: data mining, web usage mining, world wide web 



16 Mobile networking in the Internet 
Charles E. Perkins 

December 1998 Mobile Networks and Applications, Volume 3 Issue 4 

Full text available: pdf(166.90 KB) Additional Information: full citation, abstract, reference

Computers capable of attaching to the Internet from many places are likely to 
population of the Internet. Consequently, protocol research has shifted into h 
protocols for supporting mobility. This introductory article attempts to outline 
interesting research directions. The papers in this special issue indicate the di 
community, and it is ... 

17 Searching the Web 

August 2001 ACM Transactions on Internet Technology (TOIT), Volume 1 Is 
Full text available: pdf(319.98 KB) Additional Information: full citation, abstract, references, cit

We offer an overview of current Web search engine design. After introducing 
examine each engine component in turn. We cover crawling, local Web page s 
analysis for boosting search performance. The most common design and impl 
components are presented. For this presentation we draw from the literature 
engine testbed. Emphasis is on introduci ... 
Full text available: pdf(281.37 KB) Additional Information: full citation, abstract, citin

We explore how to organize large text databases hierarchically by topic to aid 
Many corpora, such as internet directories, digital libraries, and patent databa 
hierarchies, also called taxonomies. Similar to indices for relational data, taxo 
efficient. However, the exponential growth in the volume of on-line textual in 
maintain such taxono ... 

19 Machine learning in automated text categorization 

Fabrizio Sebastiani 

March 2002 ACM Computing Surveys (CSUR), Volume 34 Issue 1 

Full text available: pdf(524.41 KB) Additional Information: full citation, abstract, reference

The automated categorization (or classification) of texts into predefined categ 
the last 10 years, due to the increased availability of documents in digital form 
In the research community the dominant approach to this problem is based o 
inductive process automatically builds a classifier by learning, from a set of p 
of the categories. ... 
Michael M. Swift, Anne Hopkins, Peter Brundrett, Cliff Van Dyke, Praerit Garg, S
Jensenworth 

November 2002 ACM Transactions on Information and System Security (TISSEC 

Full text available: pdf(447.78 KB) Additional Information: full citation, abstract, references, c

This article presents the mechanisms in Windows 2000 that enable fine-grain 
for both operating system components and applications. These features were 
NT 4.0 to support the Active Directory, a new feature in Windows 2000, and t 
Internet. While the access control mechanisms in Windows NT are suitable for 
requirements, they fall short of the ... 
Full text available: pdf(192.59 KB) Additional Information: full citation, abstract, refere

Most text analysis is designed to deal with the concept of a "document", nam 
unifying subject. By contrast, individual nodes on the World Wide Web tend to 
documents. We claim that the notions of "document" and "web node" are not 
to deploy documents as collections of URLs, which we call "compound docume 
techniques for identifying and workin ... 
January 2002 ACM Transactions on Information Systems (TOIS), Volume 20

Full text available: pdf(926.20 KB) Additional Information: full citation, abstract, references, c

Keyword-based search engines are in widespread use today as a popular mea
Although such systems seem deceptively simple, a considerable amount of sk 
information needs. This paper presents a new conceptual paradigm for perform 
automates the search process, providing even non-professional users with hig 
implemented in practice in the Intelli ... 

Keywords: Search, context, invisible web, semantic processing, statistical nat 



24 Extracting usability information from user interface events 
David M. Hilbert, David F. Redmiles 

December 2000 ACM Computing Surveys (CSUR), Volume 32 Issue 4 

Full text available: pdf(1.50 MB) Additional Information: full citation, abstract, references, citin

Modern window-based user interface systems generate user interface events 
operation. Because such events can be automatically captured and because th 
an application's user interface, they have long been regarded as a potentially 
application usage and usability. However, because user interface events are t 
automated support is generally ... 
Yuval Elovici, Bracha Shapira, Adlai Maschiach

November 2002 Proceeding of the ACM workshop on Privacy in the Electroni 

Full text available: pdf(104.19 KB) Additional Information: full citation, abstract, refere

This paper presents a new privacy model for hiding the information interests 
a local area network and an access point to the Web. The suggested model is 
using identifiable members' tracks to infer the group common interests (refer 
members of the group to identify themselves to various services. The model c 
various fields of interest ... 

Keywords: Web-security, privacy, user-groups, user-profile 
Loren Terveen, Will Hill, Brian Amento
Full text available: pdf(303.62 KB) Additional Information: full citation, abstract, reference

For many purposes, the Web page is too small a unit of interaction and analys 
documents consisting of many pages, and users often are interested in obtain 
topically related sites. Once such a collection is obtained, users face the chall 
organizing the items. We report four innovations that address these user need 
Web site 
Anja Feldmann, Albert Greenberg, Carsten Lund, Nick Reingold, Jennifer Rexford
June 2001 IEEE/ACM Transactions on Networking (TON), Volume 9 Issue 3 

Full text available: pdf(212.92 KB) Additional Information: full citation, abstract, refere

Engineering a large IP backbone network without an accurate network-wide v 
Shifts in user behavior, changes in routing policies, and failures of network el 
sudden) fluctuations in load. In this paper, we present a model of traffic dema 
performance debugging of large Internet Service Provider networks. By defini 
originating from an ingres ... 
Steven John Simon
Full text available: pdf(1.88 MB) Additional Information: full citation, abstract, referenc

The growth of electronic commerce, in particular business-to-consumer, has b 
Until recently, the Web community has been a male dominated western-orien 
reflecting that homogenous audience. Using an adapted version of Hofstede's 
this study explores the perception and satisfaction levels of one hundred and 
indicates that perception and ... 
May 2002 ACM Transactions on Internet Technology (TOIT), Volume 2 Issue
Full text available: pdf(586.33 KB) Additional Information: full citation, abstract, refere 

We present the results of the <bigwig> project, which aims to 
domain-specific language for programming interactive Web ser 

A fundamental aspect of the development of the World Wide Web d 
change from static to dynamic generation of Web pages. Generating 
with the client has the advantage of providing up-to-date and tailor 
of systems ... 
Full text available: pdf(1.12 MB) Additional Information: full citation, abstract,

Traditional intrusion detection systems (IDS) detect attacks by comparing cur 
attacks. One main drawback is the inability of detecting new attacks which do 
we propose a learning algorithm that constructs models of normal behavior fr 
that deviates from the learned normal model signals possible novel attacks. O 
nonstationary, modeling pr ... 
Full text available: pdf(1.26 MB) Additional Information: full citation, abstract, references, citin

This article presents a customizable architecture for software agents that capt 
heterogeneous, distributed electronic repositories. The key idea is to exploit u 
granularity to build high-level indices with task-specific interpretations. Inform 
are configured as a network of reusable modules called structure detectors an 
Osmar R. Zaiane, Jiawei Han, Ze-Nian Li, Jean Hou

November 1998 Proceedings of the 1998 conference of the Centre for Advanced S 
Full text available: pdf(377.84 KB) Additional Information: full citation, abstract, refere

Data Mining is a young but flourishing field. Many algorithms and applications 
extract different types of knowledge. Mining multimedia data is, however, at 
implemented a prototype for mining high-level multimedia information and kn 
MultiMedia Miner has been designed based on our years of experience in the r 
data mining system, DBMiner, in the Inte ...
Full text available: pdf(330.06 KB) Additional Information: full citation, abstract,

Recently there has been significant development in the use of wavelet metho 
However, there has been written no comprehensive survey available on the to 
void. First, the paper presents a high-level data-mining framework that reduc 
components. Then applications of wavelets for each component are reviewed. T
impact of wavelets on data mining research an ... 
Weiyi Meng, Zonghuan Wu, Clement Yu, Zhuogang Li
Full text available: pdf(653.63 KB) Additional Information: full citation, abstract, refere

A metasearch engine is a system that supports unified access to multiple loca 
of the main challenges in building a large-scale metasearch engine. The probl 
determine a small number of potentially useful local search engines to invoke 
accurate selection, metadata that reflect the contents of each search engine n 
proposes a highly scalable ... 
The web contains a wealth of product reviews, but sifting through them is a d
tool would process a set of search results for a given item, generating a list o 
and aggregating opinions about each of them (poor, mixed, good). We begin 
problem and develop a method for automatically distinguishing between posit 
draws on information ... 
The need for rapid deployment and user mobility suggest the i
satellite-wireless network infrastructure for important sit
response applications. An Intelligent Information Dissemination
been developed to support the dissemination and maintenance 
throughout such a network information infrastructure in a sean 
IIDS is to transparently handle the mismatches ... 
Full text available: pdf(281.14 KB) Additional Information: full citation, abstrac

The analysis of web usage has mostly focused on sites composed of conventio 
of information available in the web come from databases or other data collect 
form of dynamically generated pages. The query interfaces of such sites allow 
Their generated results support navigation to pages of results combining cros 
analysis of visitor naviga ... 